Abstract

Evolutionary biology shares many concepts with statistical physics: both deal with populations, 
whether of molecules or organisms, and both seek to simplify evolution in very many dimensions. 
Often, methodologies have undergone parallel and independent development, as with stochastic 
methods in population genetics. We discuss aspects of population genetics that have embraced 
methods from physics: amongst others, non-equilibrium statistical mechanics, travelling waves, 
and Monte-Carlo methods have been used to study polygenic evolution, rates of adaptation, and 
range expansions. These applications indicate that evolutionary biology can further benefit from 
interactions with other areas of statistical physics, for example, by following the distribution of 
paths taken by a population through time. 

Parallel foundations of evolution and statistical physics

In the late 19th century, Boltzmann established the theoretical foundations of statistical
mechanics, in which the behaviour of ensembles of particles explains large-scale phenomena [1].
For example, the positions and velocities of the particles in a gas can fluctuate between very many
states (termed micro-states), but we average over all the configurations that give the same
observable macroscopic state (temperature and pressure, say) [2]. A similar averaging over equivalent
micro-states is made in both population and quantitative genetics: we average over individual 
gene combinations to describe a population by its allele frequencies, and we can further average 
over all the allele frequencies that are consistent with a given mean and variance of a quanti- 
tative trait. In this sense, physicists and evolutionary biologists both model populations (a gas 
or a gene pool) rather than precise types (individual particles or genotypes). This "statistical" 
description in terms of a few variables, the macro-states, summarizes the many possible con- 
figurations of the micro-states (degrees of freedom), which cannot be accurately measured or 
described. Furthermore, the macro-states are then sufficient to predict other properties without 
reference to the micro-states. For example, thermodynamics describes macroscopic properties 
without referring to individual particles; similarly, quantitative genetics does not refer to allele
frequencies to predict the trait mean in the next generation. 

Hence, evolutionary biology and statistical physics often use similar theoretical methodologies,
although they study very different phenomena. We argue that there are close analogies between
evolutionary genetics and statistical physics. Physical techniques had an early influence
on molecular biology (Appendix A). More specifically, non-equilibrium methods are based
on the same theory of stochastic processes that is used in population genetics. Thus, some physical
theories promise further developments that can deepen our understanding of evolution in
two ways: either by applying common mathematical techniques (e.g. diffusion equations, see
Appendix B), or by developing precise analogies that incorporate new concepts (e.g. ensemble
averaging, information, or entropy). These techniques are being applied in different aspects of
evolutionary biology. This article focuses mainly on those that we consider most promising for
population genetics. We aim to introduce the reader to these methods by reviewing representative
examples in the literature.



The cost of selection, entropy and information 

It is extraordinary that the selection of random mutations has created complex organisms that 
appear exquisitely designed to fit their environment. Selection can be seen as taking information 
from the environment, and coding it into the DNA sequence [3]: thus, the gene pool contains
information about those specific sequences that confer high fitness. This idea can be quantified
using the concepts of entropy and information [4]. Entropy is a measure of the number of different
states in which a population is likely to be found: thus, selection of one specific genotype, or
genotype frequency, corresponds to minimal entropy. How the genotype of an individual,
or of a population, depends on the selection that it has experienced can be quantified by
an entropy that measures how strongly selection has clustered the population around a specific
genotype. Haldane [5] showed that the number of selective deaths needed to fix an allele is
independent of the selection pressure, and Kimura [6] pointed out that Haldane's "cost of natural
selection" is exactly the information gained by fixing a specific allele. This relation applies very
generally to asexual populations [7] but fails with recombination (see below). The theory of
quasispecies (a model of mutation-selection balance) emphasizes that the reproductive rate
(selection) limits the amount of information that can be maintained in the face of random mutations
[8]. However, this constraint can be relaxed with recombination and epistasis [9]. Analogies
with statistical physics help us to understand how selection accumulates information. We first 
consider infinite populations - evolving deterministically - and then the more general case where 
random drift in finite populations drives evolution. 

Deterministic evolution and the role of recombination 

The information content of a single large population that is evolving deterministically is measured
by the entropy, defined as $S = \sum_x p_x \log(p_x)$, where $p_x$ is the frequency of allele or
genotype $x$ [5] [10]. This entropy reflects the information accumulated and maintained by evolution,
and is closely related to Shannon's information [3, 4]. Remarkably, the replicator dynamics can
be obtained by maximizing a different measure, Fisher information:
$F = \sum_x p_x \left(\frac{d}{dt}\log(p_x)\right)^2$,
which measures divergence between two distributions [10] [11]. Fisher's information is the
"acceleration" of the entropy, i.e. $d^2S/dt^2 = F$, where we interpret $\frac{d}{dt}\log(p_x)$ as the information
contained about selection when we observe how the frequency of $x$ has changed. For example,
for a beneficial allele under selection, this would be proportional to the selective value $s$.
Normally, we predict the change of the frequency distribution when we know $s$. Fisher's information
takes the parent and the offspring distributions as given, and measures the effect of selection from
the difference between these two [11].
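
The two measures discussed above can be computed along a simple deterministic trajectory. The following is a minimal sketch, assuming haploid selection with constant fitnesses and no mutation or drift; the function names, the Euler step and the value of `s` are illustrative assumptions, not taken from the cited works. Under the replicator equation, $\frac{d}{dt}\log(p_x)$ equals the fitness of type $x$ minus the mean fitness, so in this setting $F$ reduces to the variance in fitness.

```python
import math

def information_s(p):
    """S = sum_x p_x log(p_x), as written in the text (natural logarithm).
    This is the negative of Shannon's entropy, so it rises as selection
    concentrates the population on one type."""
    return sum(px * math.log(px) for px in p if px > 0.0)

def mean_fitness(p, w):
    return sum(px * wx for px, wx in zip(p, w))

def fisher_information(p, w):
    """F = sum_x p_x (d log p_x / dt)^2.
    Under dp_x/dt = p_x (w_x - w_bar), d log p_x / dt = w_x - w_bar,
    so F is the variance in fitness."""
    w_bar = mean_fitness(p, w)
    return sum(px * (wx - w_bar) ** 2 for px, wx in zip(p, w))

def replicator_step(p, w, dt):
    """One Euler step of the replicator equation (assumed: no mutation, no drift)."""
    w_bar = mean_fitness(p, w)
    stepped = [px + dt * px * (wx - w_bar) for px, wx in zip(p, w)]
    total = sum(stepped)
    return [px / total for px in stepped]

def trajectory(p0, w, dt=0.1, steps=500):
    """Return the final frequencies and a list of (S, F) pairs along the path."""
    p, history = list(p0), []
    for _ in range(steps):
        history.append((information_s(p), fisher_information(p, w)))
        p = replicator_step(p, w, dt)
    return p, history

if __name__ == "__main__":
    s = 0.1  # hypothetical selective advantage of the first allele
    p_final, history = trajectory([0.5, 0.5], [1.0 + s, 1.0])
    print("final frequencies:", p_final)
    print("initial (S, F):", history[0])
    print("final   (S, F):", history[-1])
```

With two alleles, $F = s^2 p(1-p)$ here, so it is largest at intermediate frequencies and vanishes as the favoured allele approaches fixation.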

Sexual vs. asexual reproduction. It has long been understood that, when combined with truncation
selection, sexual reproduction is much more efficient than asexual reproduction in fixing beneficial
genotypes [12] [13, Ch. 2]. In the former case, the maximum information increases as $n^{1/2}$ ($n$
being the number of loci), while in the latter, it increases by only one unit per generation [3] [13].
The maximum number of loci that can be maintained despite the randomizing effect of mutation
is $1/\mu$ for asexuals [14], whilst for sexuals (with free recombination) it can be as high as $1/\mu^2$,
where $\mu$ is the mutation rate at each locus [14] [15]. (More loci would produce more mutants, and
hence decrease the amount of information.) According to Haldane's principle [16], every deleterious
mutation must be eliminated by a failure to reproduce (a "selective death"). Therefore, the
mutation load is independent of selection strength, and is half as great if selection eliminates two
copies in a recessive homozygote at the same time. In haploids, redundancy leads to a similar
gain in efficiency [15] [17].



Stochastic evolution: the diffusion of allele frequencies 

The diffusion approximation shows how the distribution of allele frequencies at many loci 
changes through time in finite populations. In this case, selection, mutation and migration are 
modelled as deterministic factors, and genetic drift introduces random fluctuations to populations
within an ensemble (Appendix B). (Other treatments are possible, where mutations are regarded
as carrying random changes to individuals within a single population.) In other fields, constant
diffusion coefficients have been widely used, leading to simple Gaussian solutions (Appendix B).
However, Gaussian solutions are not appropriate for population genetics, because allele frequencies
range between zero and one, and sometimes cluster near fixation, in a bimodal distribution.
After its introduction by Fisher [18], Kolmogorov applied the more general diffusion method to
the neutral island model [19], which Wright [20] had already solved by different means. Kimura
relied on the diffusion approximation to model the evolution of finite populations [21], and for
his neutral theory of molecular evolution [22].
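
The point that allele frequencies do not stay Gaussian can be illustrated with one of the discrete models that the diffusion approximates. The sketch below assumes a neutral Wright-Fisher population with no mutation; the function names, population size, replicate number and seed are hypothetical choices for illustration. Across an ensemble of replicate populations, the frequencies spread out from the starting value and then pile up at zero and one.

```python
import random

def wright_fisher(p0, two_n, generations, rng):
    """Neutral Wright-Fisher drift: each generation, 2N gene copies are
    sampled at random given the previous generation's allele frequency."""
    p = p0
    for _ in range(generations):
        copies = sum(1 for _ in range(two_n) if rng.random() < p)
        p = copies / two_n
    return p

def ensemble(p0=0.5, two_n=40, generations=200, replicates=200, seed=1):
    """Final allele frequencies across an ensemble of replicate populations."""
    rng = random.Random(seed)
    return [wright_fisher(p0, two_n, generations, rng) for _ in range(replicates)]

def histogram(freqs, bins=5):
    """Counts of replicates in equal-width frequency classes on [0, 1]."""
    counts = [0] * bins
    for p in freqs:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts

if __name__ == "__main__":
    for t in (5, 40, 200):
        print("generation", t, histogram(ensemble(generations=t)))
```

After a few generations the histogram is still centred on the initial frequency; after many multiples of $2N$ generations almost every replicate sits in the outermost classes, which no Gaussian with constant diffusion coefficient can reproduce.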

The diffusion approximation, central to both population genetics and statistical physics, pro- 
vides a way to model many factors in a mathematically tractable way. Crucially, it approximates 
a wide variety of more detailed models. Mathematically, it is equivalent to the coalescent pro- 
cess that describes the evolution of samples from a population, and to path ensemble methods 
that describe the distribution of population histories (see below). In physics, diffusion equa- 
tions describe non-equilibrium processes and are hard to relate to quantities like temperature, 
entropy, or free energy, which are well-defined only in thermodynamic equilibrium through the 
Boltzmann distribution. 

Wright showed that selection, mutation and drift give an explicit distribution, proportional to
$W^{2N}$, where $W$ is the mean fitness of a population of size $N$ [20]. This is closely analogous to the
Boltzmann distribution ($\sim e^{-E/kT}$) [1] [2], with $\log(W)$ corresponding to (negative) energy, $-E$,
and $1/2N$ to the temperature, $kT$ (Appendix B). This result was the basis for Wright's metaphor
of an adaptive landscape: a surface of mean fitness laid over the multidimensional space of allele
frequencies [23] (Appendix B and Fig. 1).
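
The correspondence between Wright's distribution and the Boltzmann distribution can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not in the text: a single diploid locus, additive selection with mean fitness $W(p) = 1 + 2sp$, and symmetric mutation at rate `mu`, for which the neutral base density is $(pq)^{4N\mu - 1}$. It evaluates the density once as base density times $W^{2N}$ and once in Boltzmann form, with energy $E = -\log(W)$ and temperature $kT = 1/2N$.

```python
import math

def wright_density(n_pop, s, mu, grid=999):
    """Stationary density on an interior grid of allele frequencies p.

    Returns the grid, the density in Wright's form, and the density in
    Boltzmann form; both are normalized to integrate to one on the grid.
    """
    ps = [(i + 1) / (grid + 1) for i in range(grid)]
    dp = 1.0 / (grid + 1)

    def base(p):
        # Assumed neutral base density under symmetric mutation and drift.
        return (p * (1.0 - p)) ** (4.0 * n_pop * mu - 1.0)

    def mean_w(p):
        # Assumed additive selection: genotype fitnesses 1, 1+s, 1+2s.
        return 1.0 + 2.0 * s * p

    # Wright's form: base density times W^(2N).
    wright = [base(p) * mean_w(p) ** (2 * n_pop) for p in ps]

    # Boltzmann form: exp(-E/kT) with E = -log(W) and kT = 1/(2N).
    kt = 1.0 / (2 * n_pop)
    boltzmann = [base(p) * math.exp(-(-math.log(mean_w(p))) / kt) for p in ps]

    z_w = sum(wright) * dp
    z_b = sum(boltzmann) * dp
    return ps, [x / z_w for x in wright], [x / z_b for x in boltzmann]

if __name__ == "__main__":
    ps, wright, boltzmann = wright_density(n_pop=100, s=0.01, mu=0.005)
    dp = ps[1] - ps[0]
    print("largest difference:", max(abs(a - b) for a, b in zip(wright, boltzmann)))
    print("mean allele frequency:", sum(p * d for p, d in zip(ps, wright)) * dp)
```

Lowering $N$ plays the role of raising the temperature: the density flattens towards the neutral base distribution, whereas a large $N$ concentrates it near the fitness peak.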

Jumps between adaptive peaks. When the stationary distribution is clustered around alternative
peaks in the adaptive landscape, the rate at which random drift causes shifts between these
states is approximated by a general formula that is proportional to the probability of being at the
saddle point (adaptive valley) that separates them, and to the leading eigenvalue that describes the
instability at that point [24] [25]. Wright [25] worked out transition rates for chromosome
rearrangements, ideas rigorously formulated later using diffusions [26]. Rouhani and Barton [24] [27]
found the rate of peak shifts in a spatially structured population, borrowing from an identical
analysis of transitions between alternative vacuum states.

Traveling waves. The distribution of a quantitative trait, or of fitness itself, can be seen as a
traveling wave that moves at a steady rate as the population adapts, either in actual or in phenotypic
space. Most analyses have been of asexuals, which increase their fitness by accumulation
of favourable mutations, or decline under Muller's ratchet [28] [29] [30]. Beneficial mutations
increase in frequency independently at the wave front, where frequencies are low and subject to
drift, but the rest of the wave follows deterministically [31]. The wave thus moves at a velocity
that is proportional to the mutation rate and depends logarithmically on the population size,
because of strong random drift at the leading edge [28]. This approach has been extended to low
rates of recombination [32] [33]. With sexual reproduction, random drift has much less effect,
and the population adapts much more quickly [31] [34]. However, when there is a very high rate
of substitution and recombination, Hill-Robertson interference limits the rate of adaptation [35].

Spatial evolution and range expansions. Fisher introduced a simple non-linear diffusion equation
describing the spread of a beneficial mutation through space [36]. Though motivated by an
evolutionary problem, this model raised interest among physicists and mathematicians, establishing
a sub-discipline that studies the Fisher-KPP model (named for Kolmogorov, Petrovski and Piskounov,
co-discoverers of the model [37]). Travelling waves explain the decreased genetic diversity that
arises from hitchhiking at the leading edge [38, 39]. This approach also provides a practical
way to measure selection coefficients [40], and perhaps a means to distinguish fixation due to
selective sweeps from simple drift [41] [42].
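
For reference, the non-linear diffusion equation mentioned above is usually written as follows. The symbols are notational assumptions introduced here: $u(x,t)$ is the local frequency of the beneficial allele, $D$ the diffusion coefficient, $s$ the selective advantage and $\sigma^2$ the variance of parent-offspring dispersal distance per generation.

```latex
\[
\frac{\partial u}{\partial t}
  = D\,\frac{\partial^{2} u}{\partial x^{2}} + s\,u\,(1-u).
\]
% Ahead of the front u << 1, so the equation linearizes to
% u_t = D u_xx + s u. The ansatz u ~ exp[-k (x - c t)] then gives
\[
c(k) = D\,k + \frac{s}{k},
\qquad
c_{\min} = \min_{k>0} c(k) = 2\sqrt{sD}
\quad\text{at } k = \sqrt{s/D}.
\]
% With D = sigma^2 / 2 this is the classical speed of advance
\[
c_{\min} = \sigma\sqrt{2s}.
\]
```

The minimal speed is set entirely by the dynamics at the leading edge, where the allele is rare, which is why drift and hitchhiking at the front have such a strong effect on the diversity left behind the wave.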



Statistical mechanics and the quantitative genetics of finite populations 

Although the diffusion equation provides an exact description of evolution, the joint distribution
at many loci is hard to grasp. Statistical mechanics simplifies the problem by following
just a few variables that summarize all the allele frequencies (or in physics, the particles' states).
These map the fitness landscapes for allele frequency onto a simpler one for quantitative traits
[43], which are analogous to macroscopic quantities in statistical physics (Appendix B, Fig. 1).

Maximization of entropy. This reduction in dimensionality requires a way to account for the
degrees of freedom lost in averaging over the underlying genetic states. This can be achieved
by applying the principle of entropy maximization: we assume that the unknown micro-states
follow a distribution that maximizes their entropy, $S$, given the values of macroscopic quantities
[43]. Entropy can be defined in several ways. The definition appropriate here is analogous to
the above, but extends to the case when $p$ (the vector of allele frequencies at each locus) is
the random variable: $S = -\int \psi \log[\psi/\varphi]\,dp$. This defines the dispersion of the distribution of
allele frequencies, $\psi$, relative to a base distribution, $\varphi$: it is maximized when the distribution
is selectively neutral ($\psi = \varphi$) and decreases as the distribution becomes more tightly clustered
around states that are a priori improbable [46] [47] [48]. If $S$ is maximized whilst constraining the
expectations of some macroscopic variables, $\langle A_i \rangle = \int A_i \psi\,dp$, we obtain a distribution of allele
frequencies $\psi = Z\,\varphi \exp[2N \sum_i \alpha_i A_i]$ [46], where $Z$ normalizes the distribution and $N$ is the
population size. Remarkably, this distribution corresponds exactly to the stationary solution of
the diffusion equation (Appendix B), when the $A_i$ are chosen according to the particular mode
of selection (quantitative traits, genetic variance, etc.) and heterozygosity, and are conjugated
with the $\alpha_i$, which are the selection coefficients, mutation rates, etc. [46] [47] [48]. This analogy
between statistical mechanics and evolution of a finite population has yielded several results, of
which we will mention a few.
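
The exponential form of the maximum-entropy distribution follows from a standard Lagrange-multiplier argument, sketched here. The multipliers $\lambda_0$ and $\lambda_i$ are auxiliary symbols introduced for this sketch; $\psi$ denotes the distribution of allele frequencies, $\varphi$ the base distribution and $A_i$ the constrained macroscopic variables.

```latex
\begin{align*}
\mathcal{L}[\psi]
  &= -\int \psi \log\frac{\psi}{\varphi}\,dp
     + \sum_i \lambda_i \Big( \int A_i\,\psi\,dp - \langle A_i \rangle \Big)
     + \lambda_0 \Big( \int \psi\,dp - 1 \Big), \\
\frac{\delta \mathcal{L}}{\delta \psi}
  &= -\log\frac{\psi}{\varphi} - 1 + \sum_i \lambda_i A_i + \lambda_0 = 0, \\
\psi
  &= e^{\lambda_0 - 1}\,\varphi\,\exp\Big[ \sum_i \lambda_i A_i \Big].
\end{align*}
```

Writing $\lambda_i = 2N\alpha_i$ and absorbing $e^{\lambda_0 - 1}$ into the normalizing factor $Z$ gives the distribution quoted above; the multipliers are fixed by requiring that the constraints on $\langle A_i \rangle$ hold.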

The dynamics of polygenic evolution can be approximated by a quasi-equilibrium assumption, 
that is, that the transient distribution of allele frequencies behaves as if the entropy is maximized 
at all times, given the current values of macroscopic variables. In this way, the change through 
time of quantitative characters - including their genetic variance - can be computed for populations
affected by mutation, selection and drift, for an arbitrary number of loci [46, 49]. In physics,
macroscopic systems often change far more slowly than the microscopic fluctuations, justifying
this approximation. In biology, we do not have such a stark separation. Nevertheless, the
approximation is remarkably accurate even when the environment changes abruptly [46, 47] [49];
traveling waves may provide an explanation [31].



Adaptive landscapes and detailed balance. Wright's formula for the stationary distribution
[20] requires detailed balance [50]. Population geneticists have shown that detailed balance is
generally violated when there are more than two alleles at a locus [20], when recombination or
migration are comparable with the strength of selection, or under frequency-dependent selection
[51]. Without detailed balance, the dynamics cannot be represented by an adaptive landscape,
and can be mathematically intractable (though see [52]). Phylogenetic analysis reveals deviations
from detailed balance - for example, when genomic GC content changes over time [53]. So,
we need methods for analyzing populations that are in a stationary state that violates detailed
balance, or that are not at a statistical equilibrium at all.



Path ensembles. An alternative method that holds without detailed balance is the path ensemble
[24] [54]. Instead of describing the distribution of allele frequencies at any single time, we
follow the distribution of paths of allele frequencies between two states at different time-points
(Fig. 3). The probability of any path can be written down in a simple form, and the chance of a
transition from one state to another obtained (in principle) by integrating over paths (Appendix
B). The trajectories are weighted with respect to an optimal one, through three terms: Fisher's
information, the variance in fitness, and the fitness flux, $\phi$ (Appendix C) [54]. The latter measures
the net amount of adaptation given a population's history. It is defined as $\phi = s\,dp/dt$, where $s$ is
the selective coefficient of the beneficial allele; $\phi$ is the increase in mean fitness that is expected
from changes in allele frequency - but without allowing for changes in selection. The fitness flux
is distinct from the change in mean fitness, which in general is not well-defined when selection 
changes through time. Fitness flux includes changes in allele frequencies due to all evolutionary 
processes, and to the extent that these interfere with selection, can be negative. In considering the 
history of a population, the path ensemble methods give an understanding of the adaptation and 
evolution of complex traits that accounts for historical contingencies, an advantage over models 
that only consider a population's state at a given time. 
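
The difference between fitness flux and the net change in the population's state can be seen in a small numerical example. The sketch below assumes deterministic haploid selection, $dp/dt = s(t)\,p(1-p)$, with a selection coefficient that reverses sign halfway through; the function name, parameter values and Euler integration are illustrative assumptions. The allele frequency returns close to where it started, yet the cumulative flux $\int s\,(dp/dt)\,dt$ is positive, because the population was adapting to the prevailing direction of selection throughout.

```python
def simulate_flux(p0=0.5, s0=0.1, half_time=30.0, dt=0.01):
    """Integrate dp/dt = s(t) p (1 - p), with s = +s0 and then s = -s0.

    Returns the final frequency, the frequency at the switch, and the
    cumulative fitness flux, i.e. the sum of s * (dp/dt) * dt over the path.
    """
    steps = int(round(half_time / dt))
    p = p0
    flux_total = 0.0
    p_switch = None
    for s in (s0, -s0):
        for _ in range(steps):
            dp = dt * s * p * (1.0 - p)  # Euler step of the selection dynamics
            flux_total += s * dp         # flux increment over this time step
            p += dp
        if p_switch is None:
            p_switch = p                 # frequency when selection reverses
    return p, p_switch, flux_total

if __name__ == "__main__":
    p_end, p_switch, flux = simulate_flux()
    print("frequency at switch:", p_switch)
    print("final frequency:", p_end)
    print("cumulative fitness flux:", flux)
```

In this purely selective example each increment equals $s^2 p(1-p)\,dt$ and so cannot be negative; negative contributions would require other processes, such as drift or mutation, to move allele frequencies against selection.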

Evolutionary biology and Monte-Carlo methods



Monte Carlo methods are now widely used in statistical inference. When many variables are
involved it is not feasible to explore the whole space of possible states (e.g. all possible phylogenetic
trees amongst multiple species). A group working on nuclear weapon development at Los
Alamos introduced a simple but widely used algorithm [55]. One simply makes a random change
to the microscopic variables, accepting it if it increases some measure, $L$ (for example, mean fitness).
Changes that decrease $L$ to $L^*$ are accepted with probability $L^*/L$. This ensures that the
microscopic variables will follow a distribution proportional to $L$ multiplied by the stationary
distribution of the diffusion equation that the random changes alone would determine
(Fig. 1). This Metropolis algorithm has been developed in a statistical context [56], and
applied to generate likelihood surfaces for statistical inference [57] [58] [59]. Intriguingly, this
algorithm uses a simple form of selection to generate a distribution equal to the product of a
neutral base distribution and the measure $L$ - just as selection and random drift lead to Wright's
distribution under the diffusion approximation (see Appendix B). Both rely on detailed balance,
but a path ensemble approach allows extension to more general cases [24].
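
The acceptance rule described above can be written in a few lines. The following is a minimal sketch under stated assumptions: the states lie on a small ring, proposals move one step left or right with equal probability (so the proposals alone would give a uniform distribution), and the weights standing in for $L$ are arbitrary illustrative numbers.

```python
import random

def metropolis(weights, steps, seed=0):
    """Metropolis sampler on a ring of len(weights) states.

    A proposed move is always accepted if it increases L; a move that
    decreases L to L* is accepted with probability L*/L. Returns the
    fraction of steps spent in each state.
    """
    rng = random.Random(seed)
    k = len(weights)
    state = 0
    visits = [0] * k
    for _ in range(steps):
        proposal = (state + rng.choice((-1, 1))) % k  # symmetric random change
        ratio = weights[proposal] / weights[state]    # L*/L
        if ratio >= 1.0 or rng.random() < ratio:
            state = proposal
        visits[state] += 1
    return [v / steps for v in visits]

if __name__ == "__main__":
    weights = [1.0, 2.0, 4.0, 2.0, 1.0]  # hypothetical values of L per state
    freqs = metropolis(weights, steps=200000, seed=1)
    total = sum(weights)
    for w, f in zip(weights, freqs):
        print("target %.3f  sampled %.3f" % (w / total, f))
```

Because the proposal is symmetric, the sampled frequencies converge on $L$ normalized over the states, which is the product of the uniform base distribution and $L$.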



Obstacles to overcome 

Toy models and method-oriented analyses. Over the last decade, physicists have shown strong 
interest in evolution. For example, in the last five years, over 2000 publications on evolution 
appeared in physics journals (chiefly Physical Review journals, Physica A, and PNAS). Unfortu- 
nately, most of these works pay little attention to the fundamental biology, because the motivation 
is often the specific methods rather than the biological questions. Consequently, many of these 
contributions remain unconnected to the rest of evolutionary theory; for the most part, there
is very little communication between the disciplines. Two examples follow. First, in the Bak-Sneppen
model [60], populations evolve by removing the least fit individual together with two unrelated
neighbours, and replacing them by three new individuals with random fitness. A "critical value"
is reached, but with repeated periods where the fitness distribution spreads, and then re-organizes
to the critical value. The Bak-Sneppen model attempted to explain the distributions of extinction
episodes [61], and patterns of experimental evolution [62], but had little impact in biology
because it lacks any mechanistic basis. Notably, only 13 of 700 citations of the Bak-Sneppen
model [60] were by non-physicists. Second, in the Penna bit-string model of ageing [63], the
position in the genome of an allele dictates the age at which its detrimental effect is expressed. A
threshold for the total number of such deleterious mutations is set arbitrarily, and the population
evolves under mutation and competition. Senescence arises because selection is less effective in
late life - a phenomenon already well understood from Hamilton's general analysis [64]. Here,
out of roughly 230 citations to ref. [63], only 5 did not include physicists. These two approaches, and
others like them, are not taken seriously since they rest on "toy models" that are not connected with
biological reality.
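
For concreteness, the Bak-Sneppen update rule described above can be sketched as follows. This is an illustration under assumptions, not a reproduction of the cited implementation: the individuals are placed on a ring so that each has two neighbours, new fitnesses are drawn uniformly on [0, 1), and the system size, number of updates and seed are arbitrary.

```python
import random

def bak_sneppen(n_sites=50, steps=50000, seed=0):
    """Repeatedly replace the least fit site and its two ring neighbours
    by new sites with random fitness; return the final fitness values."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(n_sites)]
    for _ in range(steps):
        worst = min(range(n_sites), key=fitness.__getitem__)
        for j in (worst - 1, worst, worst + 1):
            fitness[j % n_sites] = rng.random()  # ring: indices wrap around
    return fitness

if __name__ == "__main__":
    final = bak_sneppen()
    above = sum(1 for f in final if f > 0.5) / len(final)
    print("fraction of sites with fitness above 0.5:", above)
```

The rule contains no genetics, reproduction or selection coefficients, which makes plain why the model is hard to connect to population-genetic theory.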

Two problems that restrict communication between disciplines. First, the language and 
nomenclature employed by physicists are often not consistent with basic concepts in genetics: 
they employ terms such as energy, spin glass, magnetization, Ising chain, etc. where they should 
use mean fitness, polygenes, directional selection, or polygenic trait [65, 66, 47] [67]. Standard
population genetics notation is largely ignored, making even the most basic equations appear
unfamiliar. To take a central example, the diffusion equation includes deterministic and stochastic
"forces". In evolution, the stochastic part models genetic drift. However, the term "drift" is
used in physics to refer to the deterministic part! Different nomenclatures make it difficult for
physicists to address important biological questions, and for biologists to understand the questions
posed by physicists. This is amplified when new ideas are introduced. For example, in an
explanation of the advantages of sex, the idea of mixability was introduced [68]: i.e. sex favours
alleles that are fit across different genetic backgrounds. A recently proposed measure of "mixability"
[68] is identical to Fisher's analysis of variance, which was devised precisely as a measure
of epistasis [69]. Take another example: a statistical mechanics approach was used to find the
distributions of contributions made by individual ancestors to future generations. This defined
the statistical "weight" of each individual's contribution in a lineage [70], which, in biological
terms, is just the reproductive value of an individual - again, a concept introduced by Fisher [71].

Second, known results are often rediscovered due to the lack of a common language. For 
example, the original result that free fitness increases in evolution was illustrated with several 
examples from population and quantitative genetics, and was interpreted in terms of selection 
and drift [48]. Yet, the same principle was twice rediscovered by physicists decades later but
with more restricted scope [50]. Another example is the NK model, in which the fitness landscape
can be "tuned" by altering the degree of epistasis for fitness; it was used to show that recombination
is an evolvable trait [72]. Yet, the theoretical analysis of the evolution of sex and recombination
has been a thriving field since the 1970s [73]. No doubt population geneticists have re-derived
results well known in physics (e.g. Wright's calculation of rates of shift between adaptive peaks),
but these are not usually published as new physics, and are typically studied for their biological
implications. Nevertheless, physicists have also had a serious commitment to subjects meaningful
to evolution. Significant works include those discussed in this article, clonal interference
in asexuals [29] [32] [33] [38], an application of percolation theory to speciation [74], extending
Haldane's principle to a multilocus trait with partial dominance, epistasis and sexual reproduction
[65] [66], and ecological explanations of replicator dynamics [75] [76]. All these are aimed
directly at a biological audience, published in appropriate journals. Generally, physicists often
have a sharp intuition about their models, which greatly helps in finding solutions.

Statistical physics is based on universal physical laws. In contrast, biological concepts are 
relative, plastic, or even arbitrary (e.g. mean fitness, traits). Hence the analogies with statistical- 
mechanical models are limited, depending on the nature of epistasis, physical linkage of the 
genes, unpredictable fluctuating selection, etc. Moreover, there are different ways in which 
precise analogies can be drawn, limiting their scope: some factors act deterministically (e.g. 
selection) and others stochastically (mutations or drift).

Conclusions 

Many of the fundamental processes of both population genetics and statistical physics are 
described by diffusion. In evolution, it provides a common framework for features such as the 
change in allele frequencies [20] [77], genealogies [78], and spatial dispersal [36]. All these, and
others, can benefit from methods of non-equilibrium statistical mechanics, which is a major and 
active field in physics. 

The concept of a path ensemble is especially useful, shifting the paradigm from tracking frequencies
at each point in time, to considering selection over the whole history of the alleles
[78]. This can be applied to both deterministic [79] and stochastic evolution [54]. In turn,
long-standing questions about the efficiency of natural selection in building complex phenotypes
EEJED, and evolution under fluctuating selection, can be re-addressed. 

Of course, we can ask whether the mathematical paraphernalia that we advocate is of any 
practical use. Although we should not take mathematical models too literally, they are useful both 
for generating hypotheses about evolution, and for making sense of ecological and genetic data. 
Most notably, the neutral theory provides the conceptual framework for analyses of sequence data 
[22], and quantitative genetics predicts the effects of selection on complex traits [80]. Ideas from
statistical mechanics may help by providing new ways to describe the evolution of complex traits,
and by suggesting constraints on the efficacy of selection. A clearer understanding of concepts
such as fitness flux and entropy suggests new ways to think about the evolution of quantitative
traits. To understand adaptation, we need to contemplate not only the current state of populations,
but also their history. This is of course an old idea, but the rationale that we review suggests new
ways to understand the process of adaptation in a historical and quantitative way.

Glossary



Boltzmann distribution A probability measure of the microscopic states of a physical system 
that is composed of classical (i.e. not quantum) particles in thermodynamic equilibrium. 
This distribution has a density proportional to the factor $\exp(-E/kT)$, where $E$ is the energy
of a state, k is Boltzmann's constant, and T is the absolute temperature. 

Detailed balance An equilibrium where the probability flux of the transitions between any two 
states is equal in either direction. In population genetics this implies that the numbers of 
adaptive and deleterious substitutions have to be equal on average. 

Entropy A measure of the number of possible configurations of a system. The classical measure
of entropy is due to Boltzmann: $S = k \log \Omega$, where $\Omega$ is the number (or density) of
microscopic states (e.g. allele frequencies) that a system can realize for a given macroscopic
state (mean fitness, a quantitative variable, etc.) and $k$ is Boltzmann's constant. Relative
entropy is defined as $S = -\int \psi \log(\psi/\varphi)\,dp$, where the $p$ are the microscopic states, and
the integral runs over all possible realizations; $\psi$ is the distribution of micro-states, and $\varphi$ is a
base or reference distribution (satisfying <p = 2NVg p ). However, when $\varphi$ is constant, we have
Shannon's entropy, which is the form used in statistical physics. Entropy is also equivalent
to the log-likelihood of $\varphi$ (the proposed distribution), where $\psi$ is the sampling probability of
the actual distribution.



Fisher's information A measure of how much an infinitesimal change in an unknown parameter
$\theta$ affects the likelihood $\psi$ of an observed data set, $p$. Fisher's information is defined as
$F = \int \psi(p; \theta) \left[\frac{\partial}{\partial\theta} \log \psi(p; \theta)\right]^2 dp$. When the parameter $\theta$ is time, Fisher's information
describes the amount of information gained through selection.

Fitness flux A measure of adaptation defined as $\phi(t) = s(p, t)\,dp/dt$, where $s$ is the selection
coefficient (fitness gradient) and $p$ is the allelic frequency. Geometrically, it is the strength
of fitness change (since $s$ is the gradient of fitness, $W$), along the direction of evolution
(given by $dp/dt$). The cumulative fitness flux, $\Phi = \int \phi\,dt$, is a measure of the total amount
of adaptation through the history of a population.

Free fitness The expected gain in log-mean fitness after selection, named by analogy with the free
energy of a physical system, which is the amount of work that can be done by a thermodynamic
system. Free fitness ($f$) emerges naturally when computing the gain in entropy $S$ after an
allele or a trait has undergone selection [48], and has an expression equivalent to that of free energy,
i.e. $f = \langle \log(W) \rangle - S/2N$ (in physics, $\langle \log(W) \rangle$ should be replaced by $\langle E \rangle$, and $2N$ by $1/kT$;
see entry for Boltzmann distribution).

Hill-Robertson interference Interference with the selective sweep of an allele, due to the selective
effects at other linked loci. Hill-Robertson interference implies that in the presence
of recombination, genotypes with multiple mutations arise more easily by recombining existing
single mutations than by multiple mutation events.

Path ensemble A formalism of non-equilibrium statistical mechanics and quantum mechanics 
where the description of the system emphasizes not the states of a population of entities, but 
rather the distribution of possible stochastic paths that such a population can follow. 

Quasispecies A population of replicators (typically asexual) with high genotypic variability
maintained by elevated mutation rates.

Replicator dynamics Dynamical equations that describe the change in time of the frequency $p$ of
the different types (in particular genotypes). They have the general form $dp/dt = p\,\Delta W + T$,
where $\Delta W$ is the difference between the fitness of the type and the mean fitness, and $T$ are
the "transmission" terms, which may involve mutation, migration, recombination, etc.

Selective death Failure to survive or reproduce due to differences in genotype. 

Stationary distribution A probability distribution that does not change in time. This is found 
from the diffusion equation by setting $\partial\psi/\partial t = 0$, and solving the resulting differential
equation that is independent of time. A stationary solution might not exist (e.g. if selection 
is changing in time in particular ways), and if it exists, it might require detailed balance. 

Statistical mechanics A mathematical framework that explains the macroscopic properties of a
system in terms of the dynamics of its microscopic variables.
At equilibrium, it leads to the classical concepts of entropy, free energy, and temperature,
for example. Out of equilibrium, these quantities cannot be defined formally, and current
research focuses on finding probabilistic measures that apply in general, but are still based
on the microscopic dynamics. Based principally on the properties of stochastic processes
(e.g. the diffusion equations, or path ensembles), these measures can be applied to the
distribution of allele frequencies (e.g. Fisher's information and fitness flux).

Traveling waves Solutions to non-linear differential equations characterized by functions that 
are of stable shape, and move at a certain velocity either in physical space, or in genetic 
space. (Traveling waves are also known as solitons in the physics and mathematics litera- 
ture.) 

Truncation selection Scheme where individuals that have traits outside a prescribed range are 
eliminated. This type of selection is popular in artificial selection. 

Appendix A. Evolution and the material basis of heredity



Early in the last century, Fisher embarked on the mathematical formalization of the Mendelian
principles of heredity, following the earlier development of biometrics by Galton, Pearson and
Weldon, all with the aim of quantifying evolution by natural selection. Fisher used the diffusion
approximation (see Appendix B) to describe the evolution of allele frequencies [18]. In 1929, he
introduced the Fundamental Theorem of Natural Selection; comparing it to the second law
of thermodynamics, the increase of entropy, he intended the theorem to be an exact result, a
"biological law" [81, 82]. Although his comparison with the second law is flawed, it shows how the
quantitative approach to heredity was influenced by statistical thermodynamics (indeed, Fisher
had studied with James Jeans, a physicist). That mechanistic basis of evolution, population
genetics, was formulated without knowledge of the physical nature of the Mendelian genes, which
was still unknown in the 1930s: the structure of DNA was not established until 1953. In the
following decades, Delbrück, formerly an astrophysicist, started a collaboration employing
ionizing radiation on Drosophila to understand the physical nature of the genes, as a working system
to try to identify fundamental physical laws that would account for living and non-living matter
[83]. Later, the ingenious Luria-Delbrück experiment proved the basis of Darwinian evolution by
showing that resistance mutations arise spontaneously rather than in response to selection.
The authors performed a statistical comparison between the number of bacteria developing resistance
to bacterial viruses and its expected distribution, which was derived from a mathematical analysis,
an unusual quantitative approach for the biologists of the time [84]. Soon after, the quantum
theorist Schrödinger published What is life? [85], posing fundamental biological questions in
physicists' language, partly based on Delbrück's discoveries. This book gave strong motivation
to the first molecular biologists (among them Perutz, Wilkins, Crick and Watson) to find how
DNA transmitted the heritable information to future generations [86]. Molecular biology was
influenced in large part by the use of physical techniques such as X-ray crystallography to
determine biological structures. Evolution, however, while resting on that material basis of DNA, is
not explained by it. Indeed, the population genetic framework that we use today was developed
prior to the discovery of the structure of DNA, and was not changed by the establishment of
molecular biology, since it rests only on Mendel's laws. However, the theoretical methods that
are common to statistical physics and to evolutionary biology give a deeper understanding of the
evolutionary consequences of heredity.



Appendix B. The Diffusion Equation 

The diffusion equation originated in Bachelier's models of fluctuations in share prices in 1900,
and was rediscovered 73 years later as the Black-Scholes formula, disastrously popular amongst
economists. Diffusion theory in economics is equivalent to the theory of Brownian motion,
devised by Einstein to explain random molecular collisions, and soon after extended in physical
applications [87]. Fisher [18] compared Mendelian genetics to "the theory of gases" and
introduced diffusion methods for allele frequencies. Kolmogorov [88] gave a more formal
approach to selection and drift. Kimura [89] later extended this formalism to non-equilibrium
cases. For population genetics, the diffusion equation is a convenient representation of the
evolution of finite populations in which genetic drift is present. We could choose to model the
change in allele frequencies directly, in what is known as a Wright-Fisher process. But genetic drift
is stochastic, making the outcomes of evolution unpredictable. The diffusion equation
gauges these outcomes in a probabilistic way, describing the distribution of allele frequencies
at each time. (A third way to describe an evolving population is to use the whole history as a
random variable instead of the allele frequencies at a time, in what is known as a path ensemble;
see Appendix C.) In short, the diffusion equation is a partial differential equation describing the
change in time of the probability density $\psi$ of the allele frequencies $p$, namely



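$$\frac{\partial \psi}{\partial t} = -\frac{\partial}{\partial p}\left(M_{\delta p}\,\psi\right) + \frac{1}{2}\frac{\partial^2}{\partial p^2}\left(V_{\delta p}\,\psi\right),$$

where $M_{\delta p}$ collects the deterministic factors, due to selection, mutation, migration, etc., and
$V_{\delta p}$ is the variance of the fluctuations due to drift, usually of the form $p(1-p)/2N$. Setting the
left-hand side to zero leads to the stationary solution derived by Wright by other means [16]:

$$\psi = C\,\bar{W}^{2N}\,[p(1-p)]^{4N\mu-1},$$

where $C$ is a normalizing constant and $\mu$ is the mutation rate. For details of the derivation
see [89]. In particular, the term $\bar{W}$ defines the "fitness landscape", which can be thought of as
a surface in the space of allele frequencies (Fig. 1), or in the quantitative variables (Fig. 2).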
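
As a minimal numerical sketch (not part of the original derivation), the stationary solution can be checked by confirming that the probability flux of the diffusion equation, $M_{\delta p}\psi - \frac{1}{2}\,\partial(V_{\delta p}\psi)/\partial p$, vanishes. The example assumes directional selection with symmetric mutation, $M_{\delta p} = s\,p(1-p) + \mu(1-2p)$, and $V_{\delta p} = p(1-p)/2N$, for which the stationary density is proportional to $e^{4Nsp}[p(1-p)]^{4N\mu-1}$. The function names, the grid and the parameter values are illustrative assumptions.

```python
import numpy as np

def drift_terms(p, N, s, mu):
    """Assumed coefficients: selection plus symmetric mutation, and binomial drift."""
    M = s * p * (1.0 - p) + mu * (1.0 - 2.0 * p)   # deterministic change
    V = p * (1.0 - p) / (2.0 * N)                  # variance due to drift
    return M, V

def stationary_density(p, N, s, mu):
    """Density proportional to exp(4 N s p) [p(1-p)]^(4 N mu - 1), normalized on the grid."""
    log_psi = 4.0 * N * s * p + (4.0 * N * mu - 1.0) * np.log(p * (1.0 - p))
    psi = np.exp(log_psi - log_psi.max())          # subtract the maximum for stability
    area = np.sum(0.5 * (psi[1:] + psi[:-1]) * np.diff(p))  # trapezoidal rule
    return psi / area

# Illustrative values giving N*s = 2.5 and N*mu = 0.7
N, s, mu = 100, 0.025, 0.007
p = np.linspace(1e-3, 1.0 - 1e-3, 2001)

M, V = drift_terms(p, N, s, mu)
psi = stationary_density(p, N, s, mu)

# Probability flux of the diffusion equation; it should vanish at stationarity.
flux = M * psi - 0.5 * np.gradient(V * psi, p)
rel_flux = np.max(np.abs(flux[1:-1])) / np.max(np.abs(M * psi))
print(f"largest relative flux on the interior grid: {rel_flux:.2e}")
```

A zero flux at every allele frequency is the detailed-balance condition; the residual printed here only reflects the finite-difference approximation of the derivative.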

The diffusion equation, the coalescent process, and the path ensemble all describe the same
process and are mathematically equivalent. Each has different advantages and limitations:
whereas a stochastic differential equation, the diffusion equation and the path ensemble do not
require detailed balance, the stationary distribution above does. Yet this solution is exact, quite
general and relatively simple.
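
To illustrate how the Wright-Fisher process mentioned above relates to the diffusion description, the following sketch simulates many independent loci and compares their mean allele frequency with the mean of the stationary diffusion density. It is a hedged example under stated assumptions: $2N$ gene copies per generation, a selective advantage $s$ for the focal allele, symmetric mutation at rate $\mu$, and a stationary density proportional to $e^{4Nsp}[p(1-p)]^{4N\mu-1}$. The function `wright_fisher` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def wright_fisher(n_copies, s, mu, generations, replicates, p0, rng):
    """Simulate allele frequencies at independent loci under selection, mutation and drift."""
    p = np.full(replicates, p0, dtype=float)
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + s * p)          # selection on the focal allele
        p = p * (1.0 - mu) + (1.0 - p) * mu        # symmetric mutation
        p = rng.binomial(n_copies, p) / n_copies   # drift: binomial sampling of gene copies
    return p

N, s, mu = 100, 0.025, 0.007                       # N*s = 2.5, N*mu = 0.7
rng = np.random.default_rng(1)
p_sim = wright_fisher(2 * N, s, mu, generations=2000, replicates=5000, p0=0.5, rng=rng)

# Mean of the assumed stationary density on a uniform grid
grid = np.linspace(1e-4, 1.0 - 1e-4, 20001)
log_psi = 4.0 * N * s * grid + (4.0 * N * mu - 1.0) * np.log(grid * (1.0 - grid))
weights = np.exp(log_psi - log_psi.max())
mean_diffusion = np.sum(grid * weights) / np.sum(weights)
mean_simulated = p_sim.mean()
print(f"simulated mean: {mean_simulated:.3f}, diffusion mean: {mean_diffusion:.3f}")
```

The two means should agree closely but not exactly, since the diffusion equation is an approximation that holds when $s$, $\mu$ and $1/N$ are small.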



Appendix C. Path ensembles and fitness flux 

A path ensemble considers all the possible histories of a population between two fixed
states $p_0$ and $p_T$ at times $0$ and $T$. In this description, each history is the variable
being described. The probability of a particular trajectory $p(t)$ is proportional to the factor

$$\exp\left[-N\int_0^T \frac{\left(\frac{dp}{dt} - M_{\delta p}\right)^2}{p(1-p)}\,dt\right],$$

where the allele frequencies $p$ are evaluated at each point of the history $p(t)$. Here, $M_{\delta p}$
is the same factor as in the diffusion equation, suggesting the connection between the two
methods. The path integral can be understood as a sum if the history is
sampled at discrete times, $p = \{p_0, p_1, \ldots, p_T\}$. Notice that because the integral is never
negative, if it achieves a minimum for a given history, then that history has the highest probability.
To understand the meaning of the integral we may expand the squared binomial inside the
integral into three terms, $F - 2\phi + v$, and consider the case of selection, $M_{\delta p} = p(1-p)s$:
$F$ is Fisher's information, $\phi = s\,dp/dt$ is the fitness flux, and $v$ is the additive genetic variance
in fitness. Thus the histories occur as a compromise between minimizing Fisher's information
and genetic variance (both regarded as measures of the speed of adaptation) and maximizing the
fitness flux.
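
The decomposition can be made concrete with a short numerical sketch. Reading the path weight as $\exp[-N\int (dp/dt - M_{\delta p})^2/(p(1-p))\,dt]$ with $M_{\delta p} = s\,p(1-p)$, the integrand expands to $\dot{p}^2/[p(1-p)] - 2s\dot{p} + s^2 p(1-p)$, which we identify with $F - 2\phi + v$. The code below evaluates this sum for a history sampled at discrete times and compares the deterministic (logistic) history with a perturbed one that shares its end points. The function `action`, the perturbation and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def action(p, t, s, N):
    """Discretized N * integral of (F - 2*phi + v) dt, plus the cumulative fitness flux."""
    dt = np.diff(t)
    pdot = np.diff(p) / dt                     # rate of change on each interval
    pm = 0.5 * (p[1:] + p[:-1])                # midpoint allele frequencies
    pq = pm * (1.0 - pm)
    fisher = pdot**2 / pq                      # F: Fisher's information term
    flux = s * pdot                            # phi: fitness flux
    var = s**2 * pq                            # v: variance term
    integrand = fisher - 2.0 * flux + var      # equals (pdot - s*pq)**2 / pq
    return N * np.sum(integrand * dt), np.sum(flux * dt)

N, s, T, p0 = 100, 0.025, 200.0, 0.1
t = np.linspace(0.0, T, 2001)

# Deterministic history: logistic solution of dp/dt = s p (1 - p)
p_det = 1.0 / (1.0 + (1.0 - p0) / p0 * np.exp(-s * t))
# Perturbed history with the same initial and final states
p_alt = p_det + 0.05 * np.sin(np.pi * t / T)

S_det, Phi_det = action(p_det, t, s, N)
S_alt, Phi_alt = action(p_alt, t, s, N)
print(f"action: deterministic {S_det:.2e}, perturbed {S_alt:.2e}")
print(f"cumulative flux: deterministic {Phi_det:.4f}, perturbed {Phi_alt:.4f}")
```

In this sketch the deterministic history has an action close to zero and therefore the largest weight, while both histories accumulate the same fitness flux, $s(p_T - p_0)$, because selection is constant and the end points coincide.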

Fitness flux is a measure of adaptation of beneficial alleles [54]; the cumulative flux
$\Phi = \int \phi\,dt$ of a population history is the equivalent measure to the fitness of a population (if
we think of successive substitutions, it is the total of all the selection coefficients associated with
each substitution). The expectation of cumulative fitness flux is necessarily at least as great as the
reduction in entropy between the initial and final equilibrium states (which can be understood as the
information gained by the population) [49]: $2N\langle\Phi\rangle \geq -\Delta S$. That is, it takes a certain amount of
selection (measured precisely by the fitness flux) to move the allele frequency distribution away
from its neutral state (as measured by the decrease in entropy). This result is quite generally valid,
and is not restricted to, say, constant selection. Moreover, if selection changes slowly so that the
distribution stays close to the stationary state, then $2N\langle\Phi\rangle = -\Delta S$; such changes are termed
"reversible". For example, assume that the allele frequencies initially follow a neutral distribution
(mutation-drift balance). Suddenly, directional selection is applied so that $\bar{W} = \exp(sp)$, and loci
move toward a new distribution under selection and drift. The fitness flux is then substantially
greater than the decrease in entropy (Fig. 3). If, on the other hand, selection were increased
very slowly, eventually to reach the same strength, the net fitness flux would necessarily be much
smaller, and equal to the decrease in entropy (lower curve in Fig. 3). The fitness flux method is
surprisingly general. However, its relation with quantities that might actually constrain the extent
of selection remains to be clarified. In particular, the additive variance in fitness is proportional to
$s^2 p(1-p)$; we see that the additive genetic variance in fitness is just twice the fitness flux, when
that includes only the change in allele frequency due to selection, $\Delta_s p = sp(1-p)$. Further
understanding could emerge from relating the decrease in entropy due to selection to the additive
genetic variance in fitness.

Last, it is relevant that fitness flux presents an extension of Fisher's Fundamental Theorem of
Natural Selection: it considers not only the change due to selection, but also the effects of
drift, and unlike Fisher's theorem, the fitness flux theorem holds also for weak selection ($Ns \sim 1$).

Figure 1: In the solution to the diffusion equation, the effects of fitness (blue) combine with neutral factors (green) to give
the distribution of allele frequencies (red). The Metropolis-Hastings algorithm has an analogous structure: the acceptance 
weights (blue) and the random fluctuations (green) combine to give the distribution that is being estimated (red). 




Figure 2: Mapping the genetic fitness landscape to a quantitative-trait fitness landscape. Left: different combinations of
allele frequencies lie in a hyperspace (shown only for a projection of 4 loci), where the axes represent the frequency of
each allele. In this plot each point represents a population. The dense cloud of points towards the centre is an optimal 
peak, set at 0011. The other clouds are at sub-optimal adaptive peaks one mutation away from the optimum. However, 
each genotype determines a trait, and the population is mapped to a space of trait means, z, and genetic variance, v. Thus, 
mean fitness, trait mean, and genetic variance, although related by the allele frequencies, generate a fitness landscape 
in quantitative variables (yellow surface, the height indicating log-mean fitness). The number of variables (degrees of 
freedom) is collapsed from a hyperspace of an arbitrary number of allele frequencies at each locus to two quantitative 
variables: trait mean and genetic variance. 

Figure 3: The top panel shows the distribution of allele frequencies through time (shown as contour levels). Initially,
populations follow the neutral distribution (left axis; $N\mu = 0.7$). Directional selection $Ns = 2.5$ is then applied, and
populations settle to a new distribution (right axis). Any actual realization (red curve) fluctuates stochastically around an 
optimal one (illustrated with the green curve). The white dashed line is the deterministic solution, shown as a reference. 
The lower panel shows the fitness flux (upper curve, green) and the decrease in entropy (lower curve, red). When selection 
changes abruptly, as here, fitness flux is substantially greater than the decrease in entropy. However, if selection were to 
change slowly, the two would be equal throughout. 






